Searching for Credible Relations in Machine Learning
Abstract
When machine learning (ML) and data mining (DM) methods construct models in complex domains, the models can contain less-credible parts [2]: relations that are statistically significant but meaningless to the human analyst. For example, consider the decision tree model presented in Figure 1. The tree was constructed with the J48 algorithm in Weka [8] for a complex domain, and it indicates which segments of the research and development (R&D) sector have the highest impact on the economic welfare of a country. Nodes in the tree represent segments of the R&D sector. Each leaf represents the economic welfare level (low, middle, or high) of the majority of countries that reached that leaf. In each leaf, the first number in brackets is the number of countries that reached the leaf, and the second is the number of those countries whose welfare level differs from the one the leaf represents. The quantities are expressed as decimals to account for countries with missing values for segments appearing in the tree. Note that the left subtree is omitted to simplify the example.
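The leaf notation described above can be made concrete with a short sketch. It assumes a leaf is printed the way J48 usually prints it, as a class label followed by "(reached/misclassified)", with the second number left out when no instance is misclassified. The names Leaf and parse_leaf and the sample strings are hypothetical illustrations, not values taken from Figure 1. Fractional counts arise because J48, like C4.5, passes an instance with a missing attribute value down several branches with fractional weights.

```python
import re
from typing import NamedTuple


class Leaf(NamedTuple):
    """One decision-tree leaf: majority class plus (possibly fractional) counts."""
    label: str            # welfare level predicted by the leaf, e.g. "high"
    reached: float        # number of countries that reached the leaf
    misclassified: float  # countries in the leaf with a different welfare level

    @property
    def error_rate(self) -> float:
        # Share of countries in the leaf that do not match the leaf's label.
        return self.misclassified / self.reached if self.reached else 0.0


# Matches "label (n)" or "label (n/e)"; n and e may be decimals.
_LEAF_PATTERN = re.compile(
    r"^\s*(?P<label>.+?)\s*"
    r"\((?P<n>\d+(?:\.\d+)?)(?:/(?P<e>\d+(?:\.\d+)?))?\)\s*$"
)


def parse_leaf(text: str) -> Leaf:
    """Parse a leaf annotation such as 'high (12.5/2.5)' (hypothetical example)."""
    match = _LEAF_PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a recognizable leaf annotation: {text!r}")
    # A missing second number means no misclassified instances.
    misclassified = float(match["e"]) if match["e"] is not None else 0.0
    return Leaf(match["label"], float(match["n"]), misclassified)


if __name__ == "__main__":
    for sample in ("high (12.5/2.5)", "low (7.0)"):  # made-up leaves
        leaf = parse_leaf(sample)
        print(leaf.label, leaf.reached, leaf.misclassified, leaf.error_rate)
```

A high error rate or a very small number of countries in a leaf is only a statistical warning sign. Whether the relation is credible remains a judgment for the human analyst.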
Journal: Informatica (Slovenia)
Volume: 37
Publication date: 2013